Rethinking Recurrent Latent Variable Model for Music Composition
We present a model for capturing musical features and creating novel
sequences of music, called the Convolutional Variational Recurrent Neural
Network. To generate sequential data, the model uses an encoder-decoder
architecture with latent probabilistic connections to capture the hidden
structure of music. Using the sequence-to-sequence model, our generative model
can exploit samples from a prior distribution and generate a longer sequence of
music. We compare the performance of our proposed model with other types of
neural networks using the criterion of Information Rate, as implemented by the
Variable Markov Oracle, a method that allows statistical characterization of
musical information dynamics and detection of motifs in a song. Our results
suggest that the proposed model has a better statistical resemblance to the
musical structure of the training data, which improves the creation of new
sequences of music in the style of the originals. Comment: Published as a conference paper at IEEE MMSP 201
Transformer Based Multi-Source Domain Adaptation
In practical machine learning settings, the data on which a model must make
predictions often come from a different distribution than the data it was
trained on. Here, we investigate the problem of unsupervised multi-source
domain adaptation, where a model is trained on labelled data from multiple
source domains and must make predictions on a domain for which no labelled data
has been seen. Prior work with CNNs and RNNs has demonstrated the benefit of
mixture of experts, where the predictions of multiple domain expert classifiers
are combined; as well as domain adversarial training, to induce a domain
agnostic representation space. Inspired by this, we investigate how such
methods can be effectively applied to large pretrained transformer models. We
find that domain adversarial training has an effect on the learned
representations of these models while having little effect on their
performance, suggesting that large transformer-based models are already
relatively robust across domains. Additionally, we show that mixture of experts
leads to significant performance improvements by comparing several variants of
mixing functions, including one novel mixture based on attention. Finally, we
demonstrate that the predictions of large pretrained transformer based domain
experts are highly homogeneous, making it challenging to learn effective
functions for mixing their predictions. Comment: 12 pages, 3 figures, 5 tables
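The attention-based mixing mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not spell out the mixing function, so everything here is an assumption made for illustration: scaled dot-product scoring, the hypothetical names `attention_mix`, `query`, and `expert_keys`, and the toy numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mix(query, expert_keys, expert_probs):
    """Mix per-expert class probabilities with attention weights.

    query:        representation of the input example (assumed, hypothetical)
    expert_keys:  one key vector per domain expert (assumed, hypothetical)
    expert_probs: one class-probability list per domain expert
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product score between the input and each expert.
    scores = [sum(q * k for q, k in zip(query, key)) / scale
              for key in expert_keys]
    weights = softmax(scores)
    n_classes = len(expert_probs[0])
    # Weighted average of the experts' class probabilities.
    mixed = [sum(w * probs[c] for w, probs in zip(weights, expert_probs))
             for c in range(n_classes)]
    return weights, mixed


# Toy usage: two experts with identical keys receive equal weight.
weights, mixed = attention_mix([1.0, 0.0],
                               [[1.0, 0.0], [1.0, 0.0]],
                               [[0.8, 0.2], [0.6, 0.4]])
```

If every expert returns the same probabilities, the mixed output equals that shared prediction whatever the weights are. This gives one intuition for why highly homogeneous experts leave a mixing function little to learn.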
Validation of the Patient Activation Measure in a Multiple Sclerosis Clinic Sample and Implications for Care
Purpose. Patient engagement in multiple sclerosis (MS) care can be challenging at times given the unpredictable disease course, wide range of symptoms, variable therapeutic response to treatment and high rates of patient depression. Patient activation, a model for conceptualising patients’ involvement in their health care, has been found useful for discerning patient differences in chronic illness management. The purpose of this study was to validate the patient activation measure (PAM-13) in an MS clinic sample.
Methods. This was a survey study of 199 MS clinic patients. Participants completed the PAM-13 along with measures of MS medication adherence, self-efficacy, depression and quality of life.
Results. Results from Rasch and correlation analyses indicate that the PAM-13 is reliable and valid for the MS population. Activation was associated with MS self-efficacy, depression and quality of life but not with self-reported medication adherence. Also, participants with relapsing-remitting MS, current employment, or high levels of education were more activated than other subgroups.
Conclusions. The PAM-13 is a useful tool for understanding health behaviours in MS. The findings of this study support further clinical consideration and investigation into developing interventions to increase patient activation and improve health outcomes in MS.
Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
Artificial Intelligence (AI) is currently spearheaded by machine learning
(ML) methods such as deep learning (DL) which have accelerated progress on many
tasks thought to be out of reach of AI. These ML methods can often be compute
hungry, energy intensive, and result in significant carbon emissions, a known
driver of anthropogenic climate change. Additionally, the platforms on which ML
systems run are associated with environmental impacts including and beyond
carbon emissions. The solution lionized by both industry and the ML community
to improve the environmental sustainability of ML is to increase the efficiency
with which ML systems operate in terms of both compute and energy consumption.
In this perspective, we argue that efficiency alone is not enough to make ML as
a technology environmentally sustainable. We do so by presenting three
high-level discrepancies in the effect of efficiency on the environmental
sustainability of ML once the many variables with which it interacts are
considered. In doing so, we comprehensively demonstrate, at multiple levels of
granularity and for both technical and non-technical reasons, why efficiency is
not enough to fully remedy the environmental impacts of ML. Based on this, we
present and argue for systems thinking as a viable path towards improving the
environmental sustainability of ML holistically. Comment: 24 pages; 6 figures
Comparison of Nest Defense Behaviors of Goshawks (Accipiter gentilis) from Finland and Montana
As human impacts on wildlife have become a topic of increasing interest, studies have focused on issues such as overexploitation and habitat loss. However, little research has examined potential anthropogenic impacts on animal behavior. Understanding the degree to which human interaction may alter natural animal behavior has become increasingly important in developing effective conservation strategies. We examined two populations of northern goshawks (Accipiter gentilis) in Montana and Finland. Goshawks in Finland were not protected until the late 1980s, and prior to this protection were routinely shot, as it was believed that shooting goshawks would keep grouse populations high. In the United States, goshawks were not managed for predator control. Though aggressive nest defense has been characterized throughout North America, goshawks in Finland do not show this same behavior. To quantify aggression, we presented nesting goshawks with an owl decoy, a human mannequin, and a live human and recorded their responses to each of the trial conditions. We evaluated the recordings for time of response, duration of response, whether or not an active stimulus was present to elicit the response (i.e., movement or sound), and the sex of the bird making the response. We used t-tests assuming unequal variances to compare the mean number of responses and response duration. Our results suggested that goshawks in Montana exhibit more aggressive nest defense behaviors than those in Finland. While this could be due to some biotic or abiotic factor that we were not able to control for in a study on such a small scale, it is also possible that the results from this study suggest another underlying cause, such as an artificial selection pressure created by shooting goshawks.
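A t-test with unequal variances is commonly known as Welch's t-test. The sketch below shows how its statistic and approximate degrees of freedom are computed. The function name `welch_t` and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Made-up response counts per trial, for illustration only.
group_one = [5, 7, 6, 8]
group_two = [1, 2, 1, 3, 2]
t_stat, df = welch_t(group_one, group_two)
```

A p-value would then come from the t distribution with `df` degrees of freedom, a step this sketch leaves out.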
Longitudinal Citation Prediction using Temporal Graph Neural Networks
Citation count prediction is the task of predicting the number of citations a
paper has gained after a period of time. Prior work viewed this as a static
prediction task. As papers and their citations evolve over time, considering
the dynamics of the number of citations a paper will receive would seem
logical. Here, we introduce the task of sequence citation prediction, where the
goal is to accurately predict the trajectory of the number of citations a
scholarly work receives over time. We propose to view papers as a structured
network of citations, allowing us to use topological information as a learning
signal. Additionally, we learn how this dynamic citation network changes over
time and the impact of paper meta-data such as authors, venues and abstracts.
To approach the introduced task, we derive a dynamic citation network from
Semantic Scholar which spans over 42 years. We present a model which exploits
topological and temporal information using graph convolution networks paired
with sequence prediction, and compare it against multiple baselines, testing
the importance of topological and temporal information and analyzing model
performance. Our experiments show that leveraging both the temporal and
topological information greatly increases the performance of predicting
citation counts over time.
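The prediction target described above, a per-paper trajectory of citation counts, can be derived from timestamped citation edges. The sketch below is an illustration, not the authors' pipeline. It assumes cumulative counts, and the `(citing, cited, year)` edge format and the name `citation_trajectories` are hypothetical.

```python
from collections import defaultdict


def citation_trajectories(edges, years):
    """Build cumulative citation counts per cited paper.

    edges: iterable of (citing, cited, year) tuples (assumed format)
    years: ordered list of years defining the trajectory steps
    Citations dated outside `years` are ignored.
    """
    per_year = defaultdict(lambda: defaultdict(int))
    for _citing, cited, year in edges:
        per_year[cited][year] += 1

    trajectories = {}
    for paper, counts in per_year.items():
        running, series = 0, []
        for year in years:
            running += counts.get(year, 0)
            series.append(running)
        trajectories[paper] = series
    return trajectories


# Toy citation network: paper A is cited three times, paper B once.
edges = [("B", "A", 2001), ("C", "A", 2002), ("D", "A", 2002), ("D", "B", 2003)]
trajectories = citation_trajectories(edges, [2001, 2002, 2003])
```

A sequence model would then be trained to predict the later entries of each trajectory from the earlier graph snapshots. Papers that are never cited do not appear in the result and would need to be added with all-zero series.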